#### ARM Instruction Set Architecture (II)

Lecture 6

Yeongpil Cho

Hanyang University

### **Topics**

- ARM Assembly Instruction General Symbols
- ARM Arithmetic and Logic Instructions

# **ARM Assembly Instruction**

# From C to Assembly



#### Load-Store Architecture

- Instructions are divided into two categories:
  - memory access operations
    - between memory and registers
  - ALU operations
    - between registers



#### Load-Store Architecture



# Assembly Instructions Supported

- Arithmetic and logic
  - Add, Subtract, Multiply, Divide, Shift, Rotate
- Data movement
  - Load, Store, Move
- Compare and branch
  - Compare, Test, If-then, Branch, compare and branch on zero
- Miscellaneous
  - Breakpoints, wait for events, interrupt enable/disable, data memory barrier, data synchronization barrier

#### **ARM Instruction Format**

mnemonic operand1, operand2, operand3



- Mnemonic represents the operation to be performed.
- The number of operands varies, depending on each specific instruction. Some instructions have no operands at all.
  - Typically, operand1 is the destination register, and operand2 and operand3 are source operands.
  - operand2 is usually a register
  - operand3 may be a register, an immediate number, a register shifted by a constant number of bits, or a register plus an offset (used for memory access).

- Examples: variants of the ADD instruction

```
ADD r1, r2, r3 ; r1 = r2 + r3

ADD r1, r3 ; r1 = r1 + r3

ADD r1, r2, #4 ; r1 = r2 + 4

ADD r1, #15 ; r1 = r1 + 15
```

# First Assembly

```
    .equ STACK_TOP, 0x20000800     /* Equates symbol to value */
    .text                          /* Tells AS to assemble region */
    .syntax unified                /* Means language is ARM UAL */
    .thumb                         /* Means ARM ISA is Thumb */
    .global _start                 /* .global exposes symbol */
                                   /* _start label is the beginning */
                                   /* ...of the program region */
    .type start, %function         /* Specifies start is a function */
                                   /* start label is reset handler */
_start:
    .word STACK_TOP, start         /* Inserts word 0x20000800 */
                                   /* Inserts word (start) */
start:                             /* start label */
    movs r0, #10                   /* r0 = loop counter */
    movs r1, #0                    /* r1 = running sum */
loop:
    adds r1, r0                    /* r1 = r1 + r0 */
    subs r0, #1                    /* r0 = r0 - 1, updates flags */
    bne loop                       /* repeat until r0 reaches 0 */
deadloop:
    b deadloop                     /* stay here forever */
    .end
```

# Encoding 16-bit Thumb Instructions

| Leading bits (from bit 15) | Remaining fields | Instruction group                       |
|----------------------------|------------------|-----------------------------------------|
| `00`                       | minor opcode     | Shift, add, subtract, move, and compare |
| `010000`                   | minor opcode, Rm/Rn, Rd/Rn | Data processing               |
| `010001`                   | minor opcode, Rm, Rd/Rn    | Special data instructions and branch |
| `01001x`                   | minor opcode     | Load from literal pool                  |
| `0101xx`                   |                  | Load/store single data item             |
| `011xxx`                   |                  | Load/store single data item             |
| `100xxx`                   |                  | Load/store single data item             |
| `10100`                    | Rd, imm8         | Generate PC-relative address            |
| `10101`                    | Rd, imm8         | Generate SP-relative address            |
| `1011`                     | minor opcode     | Miscellaneous 16-bit instructions       |
| `11000`                    | Rd, register list | Store multiple registers               |
| `11001`                    | Rd, register list | Load multiple registers                |
| `1101`                     | minor opcode, offset-8 | Conditional branch and supervisor call |
| `11100`                    | offset-11        | Unconditional branch                    |

# Encoding: ORR r1, r0

| Field | major opcode | minor opcode | Rn       | Rd       |
|-------|--------------|--------------|----------|----------|
| Bits  | 15:10        | 9:6          | 5:3      | 2:0      |
| Value | 010000       | 1100         | 000 (r0) | 001 (r1) |

ORR r1, r0 = 0100 0011 0000 0001 = 0x4301
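The field layout above can be checked with a few lines of C. This is a minimal sketch that packs the four fields into a halfword; the function name `encode_thumb16_dp` is hypothetical, and the field positions are taken from the slide rather than from any official header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack a 16-bit Thumb data-processing instruction.
 * Layout assumed from the slide:
 *   bits 15:10 = major opcode (010000)
 *   bits  9:6  = minor opcode (1100 selects ORR)
 *   bits  5:3  = source register
 *   bits  2:0  = destination register */
uint16_t encode_thumb16_dp(unsigned minor_opcode, unsigned rn, unsigned rd)
{
    assert(minor_opcode < 16 && rn < 8 && rd < 8); /* 3-bit register fields */
    return (uint16_t)((0x10u << 10) | (minor_opcode << 6) | (rn << 3) | rd);
}
```

Calling `encode_thumb16_dp(0xC, 0, 1)` corresponds to `ORR r1, r0`.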

#### 32-bit Thumb Instructions

| Bits 31:25 | Bits 24:20 | Bit 15 | Other fields           | Instruction class                                      |
|------------|------------|--------|------------------------|--------------------------------------------------------|
| `1110100`  | `xx0xx`    | x      | Rn, register list      | Load/store multiple                                    |
| `1110100`  | `xx1xx`    | x      | Rn, op3                | Load/store dual or exclusive, table branch             |
| `1110101`  | op, S      | x      | Rn, imm3, Rd, imm2, Rm | Data processing (shifted register)                     |
| `111011x`  | op1        | x      | coproc, op             | Coprocessor instructions                               |
| `11110x0`  | op         | 0      | Rn, imm3, Rd, imm8     | Data processing (modified immediate)                   |
| `11110x1`  | op         | 0      | Rn                     | Data processing (plain binary immediate)               |
| `11110xx`  | op         | 1      | op1                    | Branches and miscellaneous control                     |
| `1111100`  | `0xxx0`    | x      | op2                    | Store single data item                                 |
| `1111100`  | `xx001`    | x      | Rn, Rt, op2            | Load byte, memory hints                                |
| `1111100`  | `xx011`    | x      | Rn, Rt, op2            | Load halfword, memory hints                            |
| `1111100`  | `xx101`    | x      | Rn, op2                | Load word                                              |
| `1111100`  | `xx111`    | x      |                        | Undefined                                              |
| `1111101`  | `0xxxx`    | x      | Rn, op2, Rm            | Data processing (register)                             |
| `1111101`  | `10xxx`    | x      | Ra, op2, Rm            | Multiply, multiply accumulate, and absolute difference |
| `1111101`  | `11xxx`    | x      | op2, Rm                | Long multiply, long multiply accumulate, divide        |
| `111111x`  | op1        | x      | coproc, op             | Coprocessor instructions                               |

# Decoding: 0xF04F, 0x0003

First halfword, 0xF04F = 1111 0000 0100 1111:

| Field | fixed | i  | fixed | op   | S | Rn   |
|-------|-------|----|-------|------|---|------|
| Bits  | 15:11 | 10 | 9     | 8:5  | 4 | 3:0  |
| Value | 11110 | 0  | 0     | 0010 | 0 | 1111 |

Second halfword, 0x0003 = 0000 0000 0000 0011:

| Field | op | imm3  | Rd   | imm8     |
|-------|----|-------|------|----------|
| Bits  | 15 | 14:12 | 11:8 | 7:0      |
| Value | 0  | 000   | 0000 | 00000011 |

Result: MOV r0, #3
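The decoding step above amounts to shifting and masking each halfword. The sketch below extracts the fields of a "data processing (modified immediate)" instruction; the struct and function names are hypothetical, and the sketch only splits fields (it does not expand the modified-immediate constant).

```c
#include <stdint.h>

/* Fields of a 32-bit Thumb "data processing (modified immediate)"
 * instruction. Names are illustrative, not from an official header. */
typedef struct {
    unsigned i, op, s, rn;   /* from the first halfword  */
    unsigned imm3, rd, imm8; /* from the second halfword */
} dp_imm_fields_t;

dp_imm_fields_t decode_dp_modified_imm(uint16_t hw1, uint16_t hw2)
{
    dp_imm_fields_t f;
    f.i    = (hw1 >> 10) & 0x1;  /* bit 10     */
    f.op   = (hw1 >> 5)  & 0xF;  /* bits 8:5   */
    f.s    = (hw1 >> 4)  & 0x1;  /* bit 4      */
    f.rn   =  hw1        & 0xF;  /* bits 3:0   */
    f.imm3 = (hw2 >> 12) & 0x7;  /* bits 14:12 */
    f.rd   = (hw2 >> 8)  & 0xF;  /* bits 11:8  */
    f.imm8 =  hw2        & 0xFF; /* bits 7:0   */
    return f;
}
```

For the halfwords 0xF04F and 0x0003 this yields op = 0010 with Rn = 1111 (the MOV encoding), Rd = 0 and imm8 = 3.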

#### **ARM Instruction Code Format**



# ARM Load/Store Code Format





# Overview: Arithmetic and Logic Instructions

- Syntax: `<Operation>{<cond>}{S} Rd, Rn, Operand2`
- Shift
  - LSL (logical shift left), LSR (logical shift right), ASR (arithmetic shift right),
     ROR (rotate right), RRX (rotate right with extend)
- Logic
  - AND (bitwise and), ORR (bitwise or), EOR (bitwise exclusive or),
     ORN (bitwise or not), MVN (move not)
- Bit set/clear
  - BFC (bit field clear), BFI (bit field insert), BIC (bit clear),
     CLZ (count leading zeroes)
- Bit/byte reordering
  - RBIT (reverse bit order in a word), REV (reverse byte order in a word),
     REV16 (reverse byte order in each half-word independently),
     REVSH (reverse byte order in the bottom half-word and sign extend)
- Addition
  - ADD, ADC (add with carry)

#### Overview: Arithmetic and Logic Instructions

- Subtraction
  - SUB, RSB (reverse subtract), SBC (subtract with carry)
- Multiplication
  - MUL (multiply), MLA (multiply-accumulate),
     MLS (multiply-subtract), SMULL (signed long multiply),
     UMULL (unsigned long multiply),
     SMLAL (signed long multiply-accumulate),
     UMLAL (unsigned long multiply-accumulate)
- Division
  - SDIV (signed), UDIV (unsigned)
- Saturation
  - SSAT (signed), USAT (unsigned)
- Sign extension
  - SXTB (sign extend byte), SXTH (sign extend half-word), UXTB (zero extend byte), UXTH (zero extend half-word)
- Bit field extract
  - SBFX (signed), UBFX (unsigned)

# Example: Add

- Unified Assembler Language (UAL) Syntax
  - A common syntax for ARM and Thumb instructions

Traditional Thumb Syntax

```
ADD r1, r3 ; r1 = r1 + r3
ADD r1, #15 ; r1 = r1 + 15
```

# Commonly Used Arithmetic Operations

| Instruction                 | Description                                           |
|-----------------------------|-------------------------------------------------------|
| ADD Rd, Rn, Op2             | Add. Rd ← Rn + Op2                                    |
| ADC Rd, Rn, Op2             | Add with carry. Rd ← Rn + Op2 + Carry                 |
| SUB Rd, Rn, Op2             | Subtract. Rd ← Rn - Op2                               |
| SBC Rd, Rn, Op2             | Subtract with carry. Rd ← Rn - Op2 + Carry - 1        |
| RSB Rd, Rn, Op2             | Reverse subtract. Rd ← Op2 - Rn                       |
| MUL Rd, Rn, Rm              | Multiply. Rd ← (Rn × Rm)[31:0]                        |
| MLA Rd, Rn, Rm, Ra          | Multiply with accumulate. Rd ← (Ra + (Rn × Rm))[31:0] |
| MLS Rd, Rn, Rm, Ra          | Multiply and subtract. Rd ← (Ra - (Rn × Rm))[31:0]    |
| SDIV Rd, Rn, Rm             | Signed divide. Rd ← Rn / Rm                           |
| UDIV Rd, Rn, Rm             | Unsigned divide. Rd ← Rn / Rm                         |
| SSAT Rd, #n, Rm {,shift #s} | Signed saturate                                       |
| USAT Rd, #n, Rm {,shift #s} | Unsigned saturate                                     |

# S: Set Condition Flags

```
start:
    LDR r0, =0xFFFFFFFF
    LDR r1, =0x00000001
    ADDS r0, r0, r1
stop: B stop
```

- For most instructions, we can add a suffix S to update the N, Z, C, V bit flags of the APSR register.
- In this example, the Z and C bits are set.

**APSR**: the condition flags occupy the top bits of the register: N (bit 31), Z (bit 30), C (bit 29), V (bit 28).
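To see why the example sets Z and C, the flag update of `ADDS` can be modelled in C. This is a sketch under the usual definitions of the flags (N = result bit 31, Z = result is zero, C = unsigned carry out, V = signed overflow); the type and function names are hypothetical.

```c
#include <stdint.h>

/* Result of a 32-bit add together with the four condition flags. */
typedef struct {
    uint32_t result;
    int n, z, c, v;
} adds_result_t;

adds_result_t adds32(uint32_t a, uint32_t b)
{
    adds_result_t r;
    uint64_t wide = (uint64_t)a + b;   /* keep the 33rd bit */

    r.result = (uint32_t)wide;
    r.n = (int)((r.result >> 31) & 1u);        /* sign bit of the result   */
    r.z = (r.result == 0);                     /* result is zero           */
    r.c = (int)((wide >> 32) & 1u);            /* unsigned carry out       */
    /* Signed overflow: operands share a sign that differs from the result. */
    r.v = (int)(((~(a ^ b) & (a ^ r.result)) >> 31) & 1u);
    return r;
}
```

With `a = 0xFFFFFFFF` and `b = 0x00000001` the result wraps to 0, so Z = 1 and C = 1 while N and V stay 0.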

#### 64-bit Addition

- A register can only store 32 bits
- A 64-bit integer needs two registers
- Split 64-bit addition into two 32-bit additions
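The split described above can be sketched in C: add the lower words first, then add the upper words plus the carry out of the lower addition (what an `ADDS` followed by an `ADC` would do). The function name and parameter names are illustrative only.

```c
#include <stdint.h>

/* Add two 64-bit integers given as (hi:lo) pairs of 32-bit words. */
uint64_t add64_split(uint32_t a_lo, uint32_t a_hi,
                     uint32_t b_lo, uint32_t b_hi)
{
    uint32_t c_lo  = a_lo + b_lo;          /* lower words, may wrap around */
    uint32_t carry = (c_lo < a_lo);        /* 1 if the lower add wrapped   */
    uint32_t c_hi  = a_hi + b_hi + carry;  /* upper words plus the carry   */

    return ((uint64_t)c_hi << 32) | c_lo;
}
```

For example, `add64_split(0xFFFFFFFF, 0x2, 0x1, 0x4)` carries out of the lower word and gives an upper word of 7.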



#### 64-bit Addition in Cortex M3 processor



#### 64-bit Subtraction



```
; Subtracting two 64-bit integers A (r1:r0) and B (r3:r2).
; C (r5:r4) = A (r1:r0) - B (r3:r2)
; A = 0x00000002FFFFFFFF, B = 0x0000000400000001
LDR r0, =0xFFFFFFFF ; A's lower 32 bits
LDR r1, =0x00000002 ; A's upper 32 bits
LDR r2, =0x00000001 ; B's lower 32 bits
LDR r3, =0x00000004 ; B's upper 32 bits

; Subtract B from A
SUBS r4, r0, r2 ; C[31:0] = A[31:0] - B[31:0], update Carry
SBC r5, r1, r3 ; C[63:32] = A[63:32] - B[63:32] + Carry - 1
```
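The same two-step subtraction can be modelled in C. The sketch assumes the ARM convention that Carry = 1 after `SUBS` means no borrow occurred, which is why the upper word uses `+ carry - 1`; the function name is hypothetical.

```c
#include <stdint.h>

/* Subtract B from A, both given as (hi:lo) pairs of 32-bit words. */
uint64_t sub64_split(uint32_t a_lo, uint32_t a_hi,
                     uint32_t b_lo, uint32_t b_hi)
{
    uint32_t c_lo  = a_lo - b_lo;               /* lower words (SUBS)      */
    uint32_t carry = (a_lo >= b_lo);            /* 1 means "no borrow"     */
    uint32_t c_hi  = a_hi - b_hi + carry - 1u;  /* upper words (SBC)       */

    return ((uint64_t)c_hi << 32) | c_lo;
}
```

With the values from the listing, the lower subtraction does not borrow and the upper word becomes 2 - 4 = 0xFFFFFFFE.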

### Short Multiplication

32 bit register

```
; MUL: multiply, keeping the low 32 bits of the product.
; The low 32 bits are the same for signed and unsigned operands,
; so there is no separate unsigned form (no UMUL instruction).
MUL r6, r4, r2 ; r6 = LSB32( r4 × r2 )

; MLA: Multiply with accumulation
MLA r6, r4, r1, r0 ; r6 = LSB32( r0 + r4 × r1 )

; MLS: Multiply with subtract
MLS r6, r4, r1, r0 ; r6 = LSB32( r0 - r4 × r1 )
```

# Long Multiplication

| Instruction              | Description                                                                    |
|--------------------------|--------------------------------------------------------------------------------|
| UMULL RdLo, RdHi, Rn, Rm | Unsigned long multiply. RdHi, RdLo ← unsigned(Rn × Rm)                         |
| SMULL RdLo, RdHi, Rn, Rm | Signed long multiply. RdHi, RdLo ← signed(Rn × Rm)                             |
| UMLAL RdLo, RdHi, Rn, Rm | Unsigned multiply with accumulate. RdHi, RdLo ← unsigned(RdHi,RdLo + Rn × Rm)  |
| SMLAL RdLo, RdHi, Rn, Rm | Signed multiply with accumulate. RdHi, RdLo ← signed(RdHi,RdLo + Rn × Rm)      |

```
UMULL r3, r4, r0, r1 ; r4:r3 = r0 × r1, r4 = MSB bits, r3 = LSB bits
SMULL r3, r4, r0, r1 ; r4:r3 = r0 × r1
UMLAL r3, r4, r0, r1 ; r4:r3 = r4:r3 + r0 × r1
SMLAL r3, r4, r0, r1 ; r4:r3 = r4:r3 + r0 × r1
```
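In C, the long multiplies correspond to widening the operands to 64 bits before multiplying. The following sketch uses hypothetical function names that mirror the mnemonics; the upper and lower halves of the return value play the roles of RdHi and RdLo.

```c
#include <stdint.h>

/* Unsigned 32 x 32 -> 64-bit product. */
uint64_t umull(uint32_t rn, uint32_t rm)
{
    return (uint64_t)rn * rm;
}

/* Signed 32 x 32 -> 64-bit product. */
int64_t smull(int32_t rn, int32_t rm)
{
    return (int64_t)rn * rm;
}

/* Unsigned multiply with 64-bit accumulate: acc + rn * rm. */
uint64_t umlal(uint64_t acc, uint32_t rn, uint32_t rm)
{
    return acc + (uint64_t)rn * rm;
}
```

The difference between the signed and unsigned forms only shows in the upper word: the bit pattern 0xFFFFFFFF squared is 0xFFFFFFFE00000001 as an unsigned product, but (-1) × (-1) = 1 as a signed one.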

# Bitwise Logic

| Instruction              | Description                                                         |
|--------------------------|---------------------------------------------------------------------|
| AND Rd, Rn, Op2          | Bitwise logic AND. Rd ← Rn & operand2                               |
| ORR Rd, Rn, Op2          | Bitwise logic OR. Rd ← Rn \| operand2                               |
| EOR Rd, Rn, Op2          | Bitwise logic exclusive OR. Rd ← Rn ^ operand2                      |
| ORN Rd, Rn, Op2          | Bitwise logic NOT OR. Rd ← Rn \| (NOT operand2)                     |
| BIC Rd, Rn, Op2          | Bit clear. Rd ← Rn & NOT operand2                                   |
| BFC Rd, #lsb, #width     | Bit field clear. Rd[(width+lsb-1):lsb] ← 0                          |
| BFI Rd, Rn, #lsb, #width | Bit field insert. Rd[(width+lsb-1):lsb] ← Rn[(width-1):0]           |
| MVN Rd, Op2              | Move NOT, logically negate all bits. Rd ← 0xFFFFFFFF EOR Op2        |
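Each row of the table maps onto a one-line C expression. The sketch below is only a model of the register-to-register semantics (no flags, no shifted operands), and the `op_` function names are made up for illustration.

```c
#include <stdint.h>

uint32_t op_and(uint32_t rn, uint32_t op2) { return rn & op2;  } /* AND */
uint32_t op_orr(uint32_t rn, uint32_t op2) { return rn | op2;  } /* ORR */
uint32_t op_eor(uint32_t rn, uint32_t op2) { return rn ^ op2;  } /* EOR */
uint32_t op_orn(uint32_t rn, uint32_t op2) { return rn | ~op2; } /* ORN */
uint32_t op_bic(uint32_t rn, uint32_t op2) { return rn & ~op2; } /* BIC */
uint32_t op_mvn(uint32_t op2)              { return ~op2;      } /* MVN */
```

For instance, `op_bic(0xFF, 0x0F)` clears the low four bits and leaves `0xF0`.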

### Example: AND r2, r0, r1

Bit-wise Logic AND



### Example: ORR r2, r0, r1

• Bit-wise Logic OR



#### Example: BIC r2, r0, r1

- Bit Clear
  - r2 = r0 & NOT r1

#### Step 1:



#### Step 2:



#### BFC and BFI

- Bit Field Clear (BFC) and Bit Field Insert (BFI).
- Syntax
  - BFC Rd, #lsb, #width
  - BFI Rd, Rn, #lsb, #width

- Examples:

```
BFC R4, #8, #12     ; Clear bit 8 to bit 19 (12 bits) of R4 to 0
```

```
BFI R9, R2, #8, #12 ; Replace bit 8 to bit 19 (12 bits) of R9 with bit 0 to bit 11 from R2
```
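Both instructions reduce to building a mask of `width` ones at position `lsb`. The C sketch below models that; the function names simply mirror the mnemonics and are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `width` ones in the low bits (width may be 1..32). */
static uint32_t low_mask(unsigned width)
{
    return (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* BFC: clear bits [lsb+width-1 : lsb] of rd. */
uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb < 32 && lsb + width <= 32);
    return rd & ~(low_mask(width) << lsb);
}

/* BFI: copy bits [width-1 : 0] of rn into bits [lsb+width-1 : lsb] of rd. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb < 32 && lsb + width <= 32);
    uint32_t field = low_mask(width);
    return (rd & ~(field << lsb)) | ((rn & field) << lsb);
}
```

With `lsb = 8` and `width = 12`, the mask is `0x000FFF00`, which covers bits 8 to 19 as in the examples above.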

Bit Operators (&, |, ~) *vs* Boolean Operators (&&, ||, !)

| Boolean operator | Meaning     | Bitwise operator | Meaning     |
|------------------|-------------|------------------|-------------|
| `A && B`         | Boolean and | `A & B`          | Bitwise and |
| `A \|\| B`       | Boolean or  | `A \| B`         | Bitwise or  |
| `!B`             | Boolean not | `~B`             | Bitwise not |

- The Boolean operators treat each operand as a single true/false value (nonzero or zero) and produce 0 or 1; they do not operate bit by bit.
- For example,
  - `0x10 & 0x01` = `0x00`, but `0x10 && 0x01` = `0x01`.
  - `~0x01` = `0xFFFFFFFE`, but `!0x01` = `0x00`.

#### Check a Bit in C

`bit = a & (1 << k)`

- Example: k = 5

| a            | $a_7$ | $a_6$ | $a_5$ | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |
|--------------|-------|-------|-------|-------|-------|-------|-------|-------|
| 1 << k       | 0     | 0     | 1     | 0     | 0     | 0     | 0     | 0     |
| a & (1 << k) | 0     | 0     | $a_5$ | 0     | 0     | 0     | 0     | 0     |

#### Set a Bit in C

`a |= (1 << k)`
or
`a = a | (1 << k)`

- Example: k = 5

| a             | $a_7$ | $a_6$ | $a_5$ | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |
|---------------|-------|-------|-------|-------|-------|-------|-------|-------|
| 1 << k        | 0     | 0     | 1     | 0     | 0     | 0     | 0     | 0     |
| a \| (1 << k) | $a_7$ | $a_6$ | 1     | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |

### Clear a Bit in C

`a &= ~(1 << k)`

- Example: k = 5

| a             | $a_7$ | $a_6$ | $a_5$ | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |
|---------------|-------|-------|-------|-------|-------|-------|-------|-------|
| ~(1 << k)     | 1     | 1     | 0     | 1     | 1     | 1     | 1     | 1     |
| a & ~(1 << k) | $a_7$ | $a_6$ | 0     | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |

## Toggle a Bit in C

Without knowing the initial value, a bit can be toggled by XORing it with a "1":

`a ^= 1 << k`

- Example: k = 5

| a           | $a_7$ | $a_6$ | $a_5$      | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |
|-------------|-------|-------|------------|-------|-------|-------|-------|-------|
| 1 << k      | 0     | 0     | 1          | 0     | 0     | 0     | 0     | 0     |
| a ^= 1 << k | $a_7$ | $a_6$ | NOT($a_5$) | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ |

| $a_5$ | 1 | $a_5 \oplus 1$ |
|-------|---|----------------|
| 0     | 1 | 1              |
| 1     | 1 | 0              |

Truth table of Exclusive OR with one
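The four bit operations above can be collected into small helper functions. This is a sketch with hypothetical names; it uses the unsigned constant `1u` rather than `1` so that `k = 31` does not shift into the sign bit of a signed `int`.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if bit k of a is set, otherwise 0. */
unsigned check_bit(uint32_t a, unsigned k)
{
    assert(k < 32);
    return (a & (1u << k)) != 0;
}

/* Return a with bit k set to 1. */
uint32_t set_bit(uint32_t a, unsigned k)
{
    assert(k < 32);
    return a | (1u << k);
}

/* Return a with bit k cleared to 0. */
uint32_t clear_bit(uint32_t a, unsigned k)
{
    assert(k < 32);
    return a & ~(1u << k);
}

/* Return a with bit k inverted. */
uint32_t toggle_bit(uint32_t a, unsigned k)
{
    assert(k < 32);
    return a ^ (1u << k);
}
```

Toggling the same bit twice returns the original value, which is a quick way to sanity-check the helpers.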

# Saturating Instruction: SSAT and USAT

- Saturation is commonly used in signal processing, for example in signal amplification.
- Syntax:
  - op{cond} Rd, #n, Rm{, shift}

- SSAT saturates a signed value to the signed range $-2^{n-1} \le x \le 2^{n-1} - 1$.

$$SSAT(x) = \begin{cases} 2^{n-1} - 1 & \text{if } x > 2^{n-1} - 1 \\ -2^{n-1} & \text{if } x < -2^{n-1} \\ x & \text{otherwise} \end{cases}$$

- USAT saturates a signed value to the unsigned range $0 \le x \le 2^n - 1$.

$$USAT(x) = \begin{cases} 2^{n} - 1 & \text{if } x > 2^{n} - 1 \\ 0 & \text{if } x < 0 \\ x & \text{otherwise} \end{cases}$$

• Examples:

```
SSAT r2, #11, r1 ; output range: -2^10 ≤ r2 ≤ 2^10 - 1
USAT r2, #11, r3 ; output range: 0 ≤ r2 ≤ 2^11 - 1
```
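The two saturation rules can be modelled in C as a clamp. The sketch below ignores the optional shift operand and the Q flag; the function names mirror the mnemonics and the range limits on `n` are assumptions chosen so the shifts stay well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp x to the signed n-bit range [-2^(n-1), 2^(n-1) - 1]. */
int32_t ssat(int32_t x, unsigned n)
{
    assert(n >= 1 && n <= 32);
    int64_t max = ((int64_t)1 << (n - 1)) - 1;
    int64_t min = -((int64_t)1 << (n - 1));

    if (x > max) return (int32_t)max;
    if (x < min) return (int32_t)min;
    return x;
}

/* Clamp signed x to the unsigned n-bit range [0, 2^n - 1]. */
uint32_t usat(int32_t x, unsigned n)
{
    assert(n <= 31);
    int64_t max = ((int64_t)1 << n) - 1;

    if (x < 0)   return 0;
    if (x > max) return (uint32_t)max;
    return (uint32_t)x;
}
```

With `n = 11`, `ssat` clamps to the range -1024 to 1023 and `usat` clamps to the range 0 to 2047.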

# **Example of Saturation**

• Assume data are limited to **16** bits



- These instructions can be used to change endianness.

| Instruction  | Description                                                                                                                                      |
|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------|
| RBIT Rd, Rn  | Reverse bit order in a word. for (i = 0; i < 32; i++) Rd[i] ← Rn[31-i]                                                                           |
| REV Rd, Rn   | Reverse byte order in a word. Rd[31:24] ← Rn[7:0], Rd[23:16] ← Rn[15:8], Rd[15:8] ← Rn[23:16], Rd[7:0] ← Rn[31:24]                               |
| REV16 Rd, Rn | Reverse byte order in each half-word. Rd[15:8] ← Rn[7:0], Rd[7:0] ← Rn[15:8], Rd[31:24] ← Rn[23:16], Rd[23:16] ← Rn[31:24]                       |
| REVSH Rd, Rn | Reverse byte order in bottom half-word and sign extend. Rd[15:8] ← Rn[7:0], Rd[7:0] ← Rn[15:8], Rd[31:16] ← Rn[7] replicated into all 16 bits    |

• RBIT Rd, Rn

Example

```
LDR r0, =0x12345678; r0 = 0x12345678
RBIT r1, r0; Reverse bits, r1 = 0x1E6A2C48
```

• REV Rd, Rn



- Example:

```
LDR R0, =0x12345678 ; R0 = 0x12345678
REV R1, R0          ; R1 = 0x78563412
```

• **REV16** Rd, Rn



- Example:

```
LDR R0, =0x12345678 ; R0 = 0x12345678
REV16 R2, R0        ; R2 = 0x34127856
```

• REVSH Rd, Rn



- Example:

```
LDR R0, =0x33448899 ; R0 = 0x33448899
REVSH R1, R0        ; R1 = 0xFFFF9988
```
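The four reordering instructions can be reproduced in portable C with shifts and masks, which is a handy way to check the worked examples. The function names mirror the mnemonics and are otherwise hypothetical.

```c
#include <stdint.h>

/* RBIT: reverse the order of all 32 bits. */
uint32_t rbit(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u); /* move the lowest bit of x into r */
        x >>= 1;
    }
    return r;
}

/* REV: reverse the order of the four bytes. */
uint32_t rev(uint32_t x)
{
    return (x << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

/* REV16: swap the two bytes inside each half-word. */
uint32_t rev16(uint32_t x)
{
    return ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
}

/* REVSH: swap the bytes of the bottom half-word, then sign extend. */
uint32_t revsh(uint32_t x)
{
    uint32_t half = ((x & 0xFFu) << 8) | ((x >> 8) & 0xFFu);
    return (half & 0x8000u) ? (half | 0xFFFF0000u) : half;
}
```

Applied to `0x12345678`, `rev` and `rev16` give the byte orders shown in the examples, and `revsh(0x33448899)` sign-extends `0x9988`.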

### Sign and Zero Extension

```
int8_t a = -1;  // a signed 8-bit integer, a = 0xFF
int16_t b = -2;  // a signed 16-bit integer, b = 0xFFFE
int32_t c;  // a signed 32-bit integer

c = a;  // sign extension required, c = 0xFFFFFFFF
c = b;  // sign extension required, c = 0xFFFFFFFE
```

### Sign and Zero Extension

| Instruction           | Description                                                             |
|-----------------------|-------------------------------------------------------------------------|
| SXTB Rd, Rm {,ROR #n} | Sign extend a byte. Rd[31:0] ← Sign Extend((Rm ROR (8 × n))[7:0])       |
| SXTH Rd, Rm {,ROR #n} | Sign extend a half-word. Rd[31:0] ← Sign Extend((Rm ROR (8 × n))[15:0]) |
| UXTB Rd, Rm {,ROR #n} | Zero extend a byte. Rd[31:0] ← Zero Extend((Rm ROR (8 × n))[7:0])       |
| UXTH Rd, Rm {,ROR #n} | Zero extend a half-word. Rd[31:0] ← Zero Extend((Rm ROR (8 × n))[15:0]) |

```
LDR R0, =0x55AA8765

SXTB R1, R0 ; R1 = 0x00000065
SXTH R1, R0 ; R1 = 0xFFFF8765
UXTB R1, R0 ; R1 = 0x00000065
UXTH R1, R0 ; R1 = 0x00008765
```
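The extension rules in the table can be sketched in C. Following the table's notation, the parameter `n` below is the number of bytes to rotate right (0 to 3) before extracting; that parameterisation and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate x right by `bits` positions (0..31). */
static uint32_t ror32(uint32_t x, unsigned bits)
{
    bits &= 31u;
    return bits ? (x >> bits) | (x << (32u - bits)) : x;
}

uint32_t uxtb(uint32_t rm, unsigned n) /* zero extend a byte */
{
    assert(n < 4);
    return ror32(rm, 8u * n) & 0xFFu;
}

uint32_t uxth(uint32_t rm, unsigned n) /* zero extend a half-word */
{
    assert(n < 4);
    return ror32(rm, 8u * n) & 0xFFFFu;
}

uint32_t sxtb(uint32_t rm, unsigned n) /* sign extend a byte */
{
    uint32_t b = uxtb(rm, n);
    return (b & 0x80u) ? (b | 0xFFFFFF00u) : b;
}

uint32_t sxth(uint32_t rm, unsigned n) /* sign extend a half-word */
{
    uint32_t h = uxth(rm, n);
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}
```

With `rm = 0x55AA8765` and `n = 0`, the low byte `0x65` has its top bit clear, so `sxtb` and `uxtb` agree, while the half-word `0x8765` has its top bit set, so `sxth` fills the upper half with ones.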

### Move Data between Registers

| Instruction      | Description                                    |
|------------------|------------------------------------------------|
| MOV Rd, Rn       | Rd ← Rn                                        |
| MVN Rd, Rn       | Rd ← NOT Rn                                    |
| MRS Rd, spec_reg | Move from special register to general register |
| MSR spec_reg, Rm | Move from general register to special register |

## Move Immediate Number to Register

| Instruction     | Description                                                   |
|-----------------|---------------------------------------------------------------|
| MOVW Rd, #imm16 | Move Wide. Rd ← #imm16 (upper halfword is zeroed)             |
| MOVT Rd, #imm16 | Move Top. Rd[31:16] ← #imm16 (lower halfword is not changed)  |
| MOV Rd, #const  | Move. Rd ← const                                              |

- Example: Load a 32-bit number into a register

```
MOVW r0, #0x4321 ; r0 = 0x00004321 (sets the lower halfword)
MOVT r0, #0x8765 ; r0 = 0x87654321 (sets the upper halfword)
```

- Order does matter!
  - MOVW zeroes the upper halfword.
  - MOVT does not change the lower halfword.
- In the reverse order, MOVW wipes out the upper halfword that MOVT just wrote:

```
MOVT r0, #0x8765 ; r0 = 0x8765xxxx
MOVW r0, #0x4321 ; r0 = 0x00004321
```
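A small C model makes the ordering rule concrete. In this sketch `movw` returns a fresh register value because it overwrites all 32 bits, while `movt` takes the old register value because it keeps the lower halfword; the function names are hypothetical.

```c
#include <stdint.h>

/* MOVW: write imm16 to the lower halfword and zero the upper halfword. */
uint32_t movw(uint16_t imm16)
{
    return (uint32_t)imm16;
}

/* MOVT: write imm16 to the upper halfword, keep the lower halfword of rd. */
uint32_t movt(uint32_t rd, uint16_t imm16)
{
    return (rd & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}
```

`movt(movw(0x4321), 0x8765)` builds the full 32-bit constant, whereas applying `movw(0x4321)` after a `movt` discards whatever `movt` wrote.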

### **Barrel Shifter**



- The second operand of the ALU passes through special hardware called the barrel shifter, so it can be shifted or rotated as part of the same instruction.
- Example:

```
ADD r1, r0, r0, LSL #3 ; LSL = logical shift left; r1 = r0 + (r0 << 3) = 9 × r0
```

#### The Barrel Shifter

Logical Shift Left (LSL)



Logical Shift Right (LSR)



Arithmetic Shift Right (ASR)



Rotate Right (ROR)



Rotate Right Extended (RRX)



Rotate left can be replaced by a rotate right with a different rotate offset.
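The five shift types, and the remark that a rotate left is a rotate right by a different offset, can be modelled in C as follows. This is a sketch for shift amounts below 32 only; the function names mirror the mnemonics, and `rol` is a hypothetical helper built on `ror`.

```c
#include <assert.h>
#include <stdint.h>

/* LSL: shift left, filling with zeros. */
uint32_t lsl(uint32_t x, unsigned n) { assert(n < 32); return x << n; }

/* LSR: shift right, filling with zeros. */
uint32_t lsr(uint32_t x, unsigned n) { assert(n < 32); return x >> n; }

/* ASR: shift right, filling with copies of the sign bit. */
uint32_t asr(uint32_t x, unsigned n)
{
    assert(n < 32);
    uint32_t shifted = x >> n;
    if ((x & 0x80000000u) && n > 0)
        shifted |= ~(0xFFFFFFFFu >> n); /* set the top n bits */
    return shifted;
}

/* ROR: rotate right; bits leaving at the bottom re-enter at the top. */
uint32_t ror(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* RRX: rotate right by one bit through the carry flag. */
uint32_t rrx(uint32_t x, unsigned carry_in)
{
    return ((carry_in & 1u) << 31) | (x >> 1);
}

/* Rotate left by n, expressed as a rotate right by 32 - n. */
uint32_t rol(uint32_t x, unsigned n)
{
    return ror(x, (32u - (n & 31u)) & 31u);
}
```

For example, rotating `0x12345678` left by 8 bits is the same as rotating it right by 24 bits.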

### **Barrel Shifter**

• Examples:

```
ADD r1, r0, r0, LSL #3
; r1 = r0 + r0 << 3 = r0 + 8 × r0

ADD r1, r0, r0, LSR #3
; r1 = r0 + r0 >> 3 = r0 + r0/8 (unsigned)

ADD r1, r0, r0, ASR #3
; r1 = r0 + r0 >> 3 = r0 + r0/8 (signed)
```

Use Barrel shifter to speed up the application

```
ADD r1, r0, r0, LSL #3 ; r1 = r0 + (r0 << 3) = 9 × r0
; is equivalent to the two-instruction sequence
MOV r2, #9             ; r2 = 9
MUL r1, r0, r2         ; r1 = r0 * 9
```